We study the causal effect of winning an Oscar Award on an actor
or actress's survival. Does the increase in social rank from a performer 
winning an Oscar increase the performer's life expectancy? Previous 
studies of this issue have suffered from healthy performer survivor 
bias, that is, candidates who are healthier will be able to act in more 
films and have more chance to win Oscar Awards. To correct this 
bias, we adapt Robins' rank preserving structural accelerated failure 
time model and g-estimation method. We show in simulation studies
that this approach corrects the bias contained in previous studies. 
We estimate that the effect of winning an Oscar Award on survival is 
4.2 years, with a 95% confidence interval of [-0.4, 8.4] years. There is
not strong evidence that winning an Oscar increases life expectancy. 

1. Introduction. Does an increase in a social animal's social "rank" cause 
the animal to live longer? This question has been studied extensively in both
nonhuman primates and humans. Animals with social ranks that experi- 
ence more stress have been shown to experience adverse adrenocortical, car- 
diovascular, reproductive, immunological, and neurobiological consequences 
[Sapolsky (2005)]. Redelmeier and Singh (2001) studied the impact of so- 
cial rank on lifetime in an intriguing context: among Hollywood actors and 
actresses, does winning an Oscar Award (Academy Award) cause the ac- 
tor's/actress's expected lifetime to increase? In Redelmeier and Singh's most 
emphasized comparison (the one cited in their abstract), they stated that 
life expectancy was 3.9 years longer for Oscar Award winners than for other 
less recognized performers and that this difference corresponded to a 28% 
mortality rate reduction for winners compared to less recognized performers
(95% CI: 10% to 42%). In an interview, Dr. Redelmeier stated, "Once
you've got that statuette on your mantel place, it's an uncontested sign of 
peer approval that nobody can take away from you, so that any subsequent 
harsh reviews leave you more resilient. It doesn't quite get under your skin. 
The normal stresses and strains of everyday life do not drag you down." 
[Associated Press Story, February 26 (2005)]. 

In Redelmeier and Singh's analysis emphasized in their abstract, they fit 
a Cox proportional hazards model with whether a performer ever wins an 
Oscar Award in his or her lifetime treated as a time-independent covariate 
and survival measured from the performer's date of birth. Sylvestre, Huszti 
and Hanley (2006) pointed out that this analysis suffers from immortal time 
bias — for a winner, the time before winning is "immortal time." In other 
words, performers who live longer have more opportunities to win Oscar 
Awards. To eliminate immortal time bias, Sylvestre et al. fit a Cox pro- 
portional hazards model with winning status treated as a time-dependent 
covariate and survival measured from a performer's date of first nomination 
(Redelmeier and Singh also fit one time-dependent covariate model with 
survival measured from the performer's date of birth). Sylvestre et al. esti- 
mated that winning an Oscar Award had a positive effect on lifetime, but 
the estimated effect was not significant. Although a valuable step forward, 
Sylvestre et al.'s analysis still suffers from healthy performer survivor bias: 
Candidates who are healthier will be able to act in more films and have 
more chances to win Oscar Awards. We provide a more detailed description 
of healthy performer survivor bias in Sections 2 and 3. 

In this paper we adapt James Robins' rank preserving structural accelerated
failure time model with g-estimation [Robins (1992); Robins et al.
(1992)] to eliminate healthy performer survivor bias; it also eliminates im- 
mortal time bias, which can be seen as one aspect of healthy performer 
survivor bias. Our analysis is based on the assumption that the winner of 
each award is selected randomly among the nominees conditional on age at 
time of nomination, number of previous nominations and number of previous
wins. We first show in a simulation study that healthy performer survivor
bias can make inferences from Cox models, with or without time-dependent
covariates, incorrect, and then show that g-estimation provides correct
inferences. We then analyze the effect of winning an Oscar on life
expectancy using g-estimation.

Our study also contributes to the debate over whether high socio-economic
status is associated with good health and long life. Famous examples are the
Whitehall studies of British civil servants; see Reid et al. (1974), Marmot, 
Rose and Hamilton (1978), Marmot, Shipley and Rose (1984), Marmot et 
al. (1991) and Ferrie et al. (2002). Recently, Rablen and Oswald (2008) 
studied the causal effect of winning a Nobel Prize on scientists' longevity. 
Correcting for potential bias, they estimated that winning the Nobel Prize,
compared to merely being nominated, is associated with between 1 and 2 
years of extra longevity. Abel and Kruger (2005) studied the longevity of 
Baseball Hall of Famers compared to the other players. They concluded that 
median post-induction survival for Hall of Famers was 5 years shorter than 
for noninducted players, which does not support the role of celebrity on 
longevity. 

The rest of our paper is organized as follows: Section 2 discusses previous
methods and their biases and presents a simulation study that documents
these biases; Section 3 describes the rank preserving structural accelerated
failure time model and g-estimation; Section 4 analyzes the Oscar Award
data; and Section 5 provides conclusions and discussion.

2. Existing methods and biases. 

2.1. Background for Oscar Awards. The Oscar Awards are the most 
prominent and most watched film awards ceremony in the world. They are 
presented annually by the Academy of Motion Picture Arts and Sciences.
We will focus on the awards in four categories — Best Lead Actor, Best Lead 
Actress, Best Supporting Actor, and Best Supporting Actress. The annual 
awards selection process is complex, but the brief schedule is as follows: In 
December, the Academy compiles a list of eligible performers for an award. 
In January, all Academy members nominate five performers in each of the 
four categories (Best Lead Actor, Best Lead Actress, Best Supporting Ac- 
tor, Best Supporting Actress). In February, nominations for each performer 
are tabulated, and the top five are publicly identified as nominees for each 
category. Then all Academy members vote for one out of five nominees, and 
the winner is the one who gets the most votes. 

2.2. Previous work. Redelmeier and Singh (2001) compiled a list of all 
nominees for the Oscar Awards from 1929 to 2000 (72 years). They also 
matched each nominee to a cast member who performed in the same film as 
the nominee and was the same sex and born in the same era as the nominee. 
Redelmeier and Singh's analysis was based on comparing 235 Oscar winners 
to 527 nonwinning nominees, and 887 performers who were never nominated 
(controls). In their primary analysis, survival was measured from
performers' dates of birth.^1 In most of Redelmeier and Singh's analyses,
they used the winner status as a fixed-in-time covariate, that is, a
performer would be considered a winner throughout the study if he or she won
an Oscar Award at least once in his or her lifetime. Kaplan-Meier curves
showed that life expectancy was 3.9 years longer for winners than for
controls, and 3.5 years longer for winners than for nonwinning nominees. In
Cox proportional hazards models with no adjustment for other covariates,
winning was estimated to reduce mortality by 28% compared to controls and by
26% compared to nonwinning nominees, with lower 95% confidence limits for
both comparisons greater than 0%, suggesting that winning an Oscar has
a beneficial effect on lifetime. Adjustment for demographic and professional
factors yielded similar results, with lower confidence limits for the
mortality reduction due to winning remaining above 0%. Redelmeier and Singh
considered one Cox proportional hazards model that used the winner status as
a time-dependent covariate, that is, an Oscar Award winning performer is
treated as a winner only after he or she won an Award. This model estimated
a mortality rate reduction of 20% for winners vs. controls, with a lower 95%
CI limit of 0%.

^1 Redelmeier and Singh also considered survival from the day each
performer's first film was released, each performer's 65th birthday
(excluding performers who died before 65), and each performer's 50th
birthday (excluding performers who died before 50). As noted by Sylvestre,
Huszti and Hanley (2006), all of these methods of measuring time-zero suffer
from immortal time bias.

Sylvestre, Huszti and Hanley (2006) pointed out that analyses that treat 
winner status as a fixed-in-time covariate credit the winners' lifetime before
winning toward survival subsequent to winning. These "immortal" years will 
cause bias in the estimate of the causal effect of winning. We will focus on 
Sylvestre et al.'s method for correcting this bias in comparing winners to 
nonwinning nominees. Sylvestre et al. used a Cox proportional hazard model 
that differed in two ways from Redelmeier and Singh's primary analyses: 
(1) winning was treated as a time-dependent covariate, that is, an Oscar
Award winning performer only becomes a winner after he or she wins an award
(as noted above, Redelmeier and Singh also considered this approach in 
one of their analyses); (2) a performer was only part of the risk set once 
he or she was first nominated. Using this model, Sylvestre et al. estimated 
a mortality rate reduction of 18% for winners vs. nonwinning nominees with 
a 95% CI of -4% to 35%. Thus, this model estimates that winning an
Oscar has a beneficial effect on lifetime, but there is not strong evidence 
for a beneficial effect. Note that Sylvestre et al. used an updated data set 
compared to Redelmeier and Singh's; Sylvestre et al. considered a selection 
interval for Oscar Awards from 1929 to 2001 (73 years) with 238 winners and 
528 nonwinning nominees. Sylvestre, Huszti and Hanley (2006) also used the 
survival analysis method suggested by Efron (2002) and did an analysis with 
a binomial logistic regression model. Death in each year of a performer's life 
was treated as a Bernoulli random variable and regressed on covariates such 
as winning status, age at nomination, and calendar year of nomination. This
model yielded a result similar to that of Sylvestre et al.'s Cox proportional
hazards model analysis. The results from previous studies are listed in Table 1.

Table 1
Winners vs. nominees

Type of analysis   Status      Time-zero        Reduction in mortality rate (95% CI) (%)
PH^1               Static^3    Birthday         23 (3 to 39)
PH                 Dynamic^4   Birthday         11 (-12 to 30)
PH                 Dynamic     Nomination day   18 (-4 to 35)
PY^2               Dynamic     Nomination day   18 (-4 to 36)

Notes: These results are based on the updated data set in Sylvestre, Huszti and Hanley
(2006). The first row is the Cox model without adjustment for any covariates; the second
row is the Cox model with winning status as a time-dependent covariate and with sex and
year of birth as time-independent covariates; the third row is the Cox model with the same
covariates as the second row, but with nomination day as time-zero; the fourth row is the
binomial logistic regression model with sex, age, and calendar year as covariates. The first
two rows are from Redelmeier and Singh's analysis (using Sylvestre et al.'s updated data
set), and the last two rows are from Sylvestre et al.'s analysis.
^1 PH stands for Cox proportional hazard model.
^2 PY stands for performer years analysis, which is the binomial logistic regression model
described above.
^3 Static status treats the winning status as a fixed-in-time covariate.
^4 Dynamic status treats the winning status as a time-dependent covariate.

2.3. Healthy performer survivor bias. Previous studies have suffered from 
healthy performer survivor bias, that is, candidates who are healthier will 
be able to act in more films and have more chances to win Oscar Awards. 

One aspect of healthy performer survivor bias is immortal time bias, that 
is, candidates will have more chances to win Oscar Awards if they live longer. 
When a performer is classified as a winner throughout the study, regardless 
of when the performer wins the award, there are unfair comparisons between 
winners and nonwinning performers who died before the winner won the 
award. As an example, consider Henry Fonda and Dan Dailey, who were 
both first nominated for an Oscar Award at the age of 35 but did not win 
in their first nominations. Fonda first won an Oscar at age 77 and died four 
months after, while Dailey never won an Oscar and died at age 64. Fonda 
lived 13 years beyond the age of Dailey's death before winning an Oscar. 
It is not fair to consider the 13 years before Fonda won his Oscar as being 
affected by winning. 

To correct for immortal time bias, Sylvestre et al. used a Cox proportional 
hazard model with the winning status as a time-dependent covariate. In this 
model, the survival comparison between a winner and a nonwinning nominee 
starts appropriately only at the time the winner wins. 
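
The difference between the two codings can be made concrete with the Fonda and Dailey example. The following is a minimal sketch, not Sylvestre et al.'s code: the function name `split_follow_up` and the interval layout are hypothetical, ages are in years, and Fonda's death is taken as four months after his win at age 77.

```python
from typing import List, Optional, Tuple

# Each interval is (start_age, stop_age, winner_status, died_at_stop).
Interval = Tuple[float, float, int, bool]


def split_follow_up(nominated: float, died: float,
                    won: Optional[float]) -> List[Interval]:
    """Split follow-up from first nomination to death at the age of first win.

    Before the win the performer contributes time as a nonwinner (status 0);
    after the win, as a winner (status 1).
    """
    if won is None or won >= died:
        return [(nominated, died, 0, True)]
    return [(nominated, won, 0, False), (won, died, 1, True)]


fonda = split_follow_up(nominated=35, died=77 + 4 / 12, won=77)
dailey = split_follow_up(nominated=35, died=64, won=None)

# Years of follow-up credited to winner status under each coding.
dynamic_winner_years = sum(stop - start for start, stop, w, _ in fonda if w == 1)
static_winner_years = fonda[-1][1] - fonda[0][0]  # whole follow-up counted as "winner"
```

Under the time-dependent coding only the four months after the win count as winner time, whereas the fixed-in-time coding would credit the entire follow-up since first nomination to winner status.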

Although Sylvestre et al.'s analysis was an important advance in that 
it corrects for immortal time bias, it still suffers from other aspects of the 
healthy performer survivor bias. Winning an Oscar Award is an indicator
of being healthy. In Sylvestre et al.'s analysis, the risk set at a given age 
consists of those performers who have been nominated by that age. Among 
these performers, those who are healthy at the given age have had more 
opportunities to perform and to win an Oscar. These healthy performers 
are also more likely to live longer. Since having won an Oscar is associated 
with survival in a risk set even if winning has no causal effect on survival, 
there is the potential for bias. 

As an example consider Jack Palance and Arthur O'Connell who were first 
nominated for Oscars but did not win at ages 34 and 48, respectively. Palance 
won an Oscar at age 73, while O'Connell never won an Oscar. Palance was 
an active actor when he was in his 70s, acting in ten films in his 70s, and 
lived to be 87. On the other hand, O'Connell was stricken with Alzheimer's 
disease by the time he turned 70 and by the time of his death at age 73, he 
was appearing solely in toothpaste commercials (www.imdb.com). The fact 
that Palance lived longer than O'Connell in the risk set that started at age 
73 after Palance's first win is not likely due to the effect of winning but to 
the healthy performer survivor bias. 

One way of attempting to control for healthy performer survivor bias is 
to condition on (control for) confounders in the Cox model. In particular, 
nomination history is a confounder because it is a strong risk factor for 
subsequently winning an Oscar Award (indeed, it is necessary) and for
mortality, since sick individuals do not get nominated. Previous studies did not
condition on nomination history and thus suffered from confounding bias. 

However, even if we condition on nomination history, as well as past age 
and Oscar wins, and there are no other confounders besides these variables, 
the time-dependent Cox model can be biased if Oscar winning affects future 
nominations [Robins (1986, 1992)]. It is substantively plausible that previ- 
ous Oscar winning affects future nomination (even under the stronger null 
hypothesis that neither nomination nor winning affects health). The effect 
could go in either direction. For example, among two subjects with the same 
nomination history, only one of whom won before, the winner would have 
a higher probability of being renominated if increased fame coming from 
previously winning results in an increased chance of nomination per film, all 
else being equal. On the other hand, the winner would have a lower proba- 
bility of being renominated if nominators felt those who have not won before 
are more deserving of a chance to win. 

To understand the bias in the Cox analysis when previous Oscar winning 
affects future nomination, suppose a previous winner has a higher probability 
of being renominated, all else being equal. Then one would expect that 
among the nominees in a given year with same past nomination histories, 
the previous winners would be less healthy than the previous nonwinners, 
since the nonwinners might have had to be in a large number of movies in
the previous year to get nominated for one of them, while for the winner it
often would suffice to be in just one. But only a healthy person could be in
many movies in one year. Note that this bias persists even if we had data on
the number of movies performed in each year and adjusted for this variable
as well as nomination.

Table 2
Performers' age pattern

          Sick age   Death age
Group 1   60         70
Group 2   70         80
Group 3   80         90

2.4. Simulation studies. To illustrate the potential of previous studies of 
survival in Oscar Award winning performers to suffer from healthy performer 
survivor bias, we conducted a simulation study. 

We first assigned each performer a lifetime and an age at which the
performer became sick. Then, for each year, we randomly picked nominees
from performers who were still alive and healthy, and randomly selected one
of them as the winner. Hence, winning an award has no effect on performers'
lifetimes, because each lifetime is determined before the winners are
decided. If a method shows an effect of winning over repeated simulations
from this setting, it is biased.

For each year between 1830 and 1999, we simulated five performers being 
born. Each performer was randomly selected to have one of the three age 
patterns shown in Table 2. 
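
The cohort-generation step can be sketched as follows. This is an illustrative sketch rather than the simulation code used for the results: the `Performer` class, the `simulate_cohort` function, the equal group probabilities and the rule that a performer stops being healthy on reaching the sick age are all assumptions made for the example.

```python
import random
from dataclasses import dataclass

# Age patterns from Table 2: group -> (age at which sick, age at death).
AGE_PATTERNS = {1: (60, 70), 2: (70, 80), 3: (80, 90)}


@dataclass
class Performer:
    birth_year: int
    group: int
    sick_age: int
    death_age: int

    def is_healthy(self, year: int) -> bool:
        """Born, alive and not yet sick in the given calendar year (assumed rule)."""
        age = year - self.birth_year
        return 0 <= age < self.sick_age


def simulate_cohort(rng: random.Random) -> list:
    """Five performers born in each year from 1830 to 1999."""
    cohort = []
    for birth_year in range(1830, 2000):
        for _ in range(5):
            group = rng.choice([1, 2, 3])  # equal probabilities: an assumption
            sick_age, death_age = AGE_PATTERNS[group]
            cohort.append(Performer(birth_year, group, sick_age, death_age))
    return cohort


cohort = simulate_cohort(random.Random(0))
# Healthy candidates aged 60-69 in an example award year.
healthy_60s_in_1950 = [p for p in cohort
                       if p.is_healthy(1950) and 60 <= 1950 - p.birth_year <= 69]
```

Because death ages are fixed when the cohort is created, any later award assignment cannot change anyone's lifetime, which is what makes this a null setting.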

For each year from 1927 to 2004, there is one award, and we select five
nominees from those performers who are still alive and healthy. Specifically,
we select two nominees from the age group 30-39 and one from the age group
70-79, selecting randomly among healthy performers in those age groups.
We also select two nominees from the age group 60-69, but with selection
probabilities that differ among healthy performers in this age group; the
selection weights for a healthy candidate aged 60-69 are given in Table 3.



Table 3
Selection weight for age 60-69

          Previous winner   Previous nonwinner
Group 1   0                 0
Group 2   8                 1
Group 3   9                 7




HAN, SMALL, FOSTER AND PATEL 



Under these weights, for age group 60-69, winning in the past increases the chance
to be selected as a nominee, and previous nonwinners tend to be healthier 
than previous winners (i.e., in Group 3 rather than in Group 2). This corre- 
sponds to the fact that previous nonwinners might have had to be in a large 
number of movies in the previous year to get nominated for one of them, 
while for the previous winners it often would suffice to be in just one film to 
get nominated. Consequently, nominated previous winners tend to be less 
healthy than nominated previous nonwinners, because nominated previous 
nonwinners tend to be very healthy to be able to act in many films. 

Nominees from different age groups have a different probability to be
selected as the winner, with older nominees having a better chance. The
winning probability also depends on the nomination history and winning
history. Let I_30, I_60 and I_70 be the indicators of current nomination age
group 30-39, 60-69, and 70-79, respectively. Let N_30, N_60, N_70 be the
number of previous nominations in the age group 30-39, 60-69, and 70-79,
respectively. Let W_30, W_60, W_70 be the number of previous wins in the age
group 30-39, 60-69, and 70-79, respectively. The winning probability for
each nominee in a given year is calculated as

\[
P(\text{nominee } i \text{ wins} \mid N^{i}, W^{i})
= \frac{\exp\{0.5\, I^{i}_{30} + I^{i}_{60} + 2\, I^{i}_{70}
        + 0.5\,(N^{i}_{30} + N^{i}_{60} + N^{i}_{70} + W^{i}_{30} + W^{i}_{60} + W^{i}_{70})\}}
       {\sum_{j=1}^{5} \exp\{0.5\, I^{j}_{30} + I^{j}_{60} + 2\, I^{j}_{70}
        + 0.5\,(N^{j}_{30} + N^{j}_{60} + N^{j}_{70} + W^{j}_{30} + W^{j}_{60} + W^{j}_{70})\}},
\]

where the superscripts i and j index the five nominees in that year.

We choose these coefficients to magnify the healthy performer survivor bias. 
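
As a rough illustration of the winning probability above, the following sketch normalizes exponentiated scores over the five nominees of one year. The function names and the dictionary layout are hypothetical, and `prev_noms` and `prev_wins` stand for a nominee's total previous nominations and wins summed over the three age groups.

```python
import math
from typing import Dict, List

# Coefficients on the current age-group indicators, as given in the text.
AGE_COEF = {30: 0.5, 60: 1.0, 70: 2.0}


def winning_score(age_group: int, prev_noms: int, prev_wins: int) -> float:
    """Age-group term plus 0.5 for each previous nomination and each previous win."""
    return AGE_COEF[age_group] + 0.5 * (prev_noms + prev_wins)


def winning_probabilities(nominees: List[Dict[str, int]]) -> List[float]:
    """Normalize exp(score) over the nominees so the probabilities sum to one."""
    weights = [math.exp(winning_score(n["age_group"], n["prev_noms"], n["prev_wins"]))
               for n in nominees]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical slate: two nominees in their 30s, two in their 60s, one in their 70s.
nominees = [
    {"age_group": 30, "prev_noms": 0, "prev_wins": 0},
    {"age_group": 30, "prev_noms": 0, "prev_wins": 0},
    {"age_group": 60, "prev_noms": 1, "prev_wins": 1},
    {"age_group": 60, "prev_noms": 1, "prev_wins": 0},
    {"age_group": 70, "prev_noms": 0, "prev_wins": 0},
]
probs = winning_probabilities(nominees)
```

In this hypothetical slate the previous winner in the 60-69 group is more likely to win again than the previous nonwinner with the same nomination count, and the oldest nominee is favored over the youngest ones.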

In our simulation setting, death ages are determined before winning; thus,
winning has no causal effect on lifetime. Therefore, for the null hypothesis
that there is no treatment effect of winning an Oscar Award on an actor's
survival, the p-values should be uniformly distributed between 0 and 1, and
the mean of p-values should be around 0.5. If the mean of p-values from
a method is much smaller than 0.5, then the method is biased.
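
The benchmark of a mean p-value near 0.5 can be checked on a toy problem. The sketch below is not the survival simulation described here; it assumes a simple two-sample z-test with known unit variance and no true difference between arms, purely to show how p-values from a valid test behave under the null. All names are hypothetical.

```python
import math
import random


def two_sided_p_value(z: float) -> float:
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def null_p_values(n_sims: int, n_per_arm: int, rng: random.Random) -> list:
    """p-values from repeated z-tests when both arms share the same distribution."""
    p_values = []
    for _ in range(n_sims):
        a = [rng.gauss(0.0, 1.0) for _ in range(n_per_arm)]
        b = [rng.gauss(0.0, 1.0) for _ in range(n_per_arm)]
        diff = sum(a) / n_per_arm - sum(b) / n_per_arm
        z = diff / math.sqrt(2.0 / n_per_arm)  # known unit variance in both arms
        p_values.append(two_sided_p_value(z))
    return p_values


p_values = null_p_values(n_sims=2000, n_per_arm=50, rng=random.Random(1))
mean_p = sum(p_values) / len(p_values)
```

A method whose mean p-value falls well below this benchmark in a null simulation is rejecting too often, which is the sense of "biased" used in this section.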

The results from 1000 simulations are shown in Table 4 and histograms
of p-values can be found in Figure 3 of Section 3.5.

Redelmeier and Singh's results were based on the first two methods in 
Table 4, and Sylvestre et al.'s results were based on the last two methods in 
Table 4. All of these four methods are biased. 

In our simulation setting, past winning history affects future nominations, 
and past nomination history also affects future winning. The previous meth- 
ods did not account for the nomination history in the time-dependent Cox 



SURVIVAL IN OSCAR AWARD WINNING PERFORMERS

Table 4
Simulation results

Type of analysis   Status    Time-zero        Mean of p-value
PH                 Static    Birthday         0.03
PH                 Dynamic   Birthday         0.12
PH                 Dynamic   Nomination day   0.12
PY                 Dynamic   Nomination day   0.04

model. Next we will show that even if one correctly models the effect of 
nomination history on the hazard of death, the hazard model still provides 
biased estimates of the causal effect of winning on survival. 

To simplify the consideration of nomination history and winning history,
we restrict every candidate to be nominated at most twice and win at most
twice. Let D_70 and D_80 denote death at age 70 and 80, respectively. Let
S_69 and S_79 denote survival at age 69 and 79, respectively. Let N(30) and
N(60) denote the numbers of nominations in the age group 30-39 and 60-69,
respectively. Let A(30) and A(60) denote the numbers of wins in the age
group 30-39 and 60-69, respectively. Based on 1000 Monte Carlo simulations,
we obtained estimated mortality hazard rates and corresponding 95%
confidence intervals for this full model in Tables 5 and 6.

For a reduced model without winning history, the mortality rates adjusting
only for nomination history are shown in Table 7. From these mortality
rates, we can see that even though winning has no causal effect on survival,
winning history affects the hazard of death given nomination history. For
example, the hazard of dying at 80 for people with one nomination during
their 30s and no further nominations is much higher for people who did
not win an award (0.674) than for those who won one award (0.555).

Table 5
Mortality rates for death at 70 conditional on survival to 69 with nomination
history and winning history when nomination is affected by past winning history

N(30)  A(30)  N(60)  A(60)  Mortality rates (95% CI)
2      2      0      0      0.355 (0.299, 0.411)
2      1      0      0      0.349 (0.332, 0.366)
2      0      0      0      0.327 (0.320, 0.334)
1      1      0      0      0.508 (0.494, 0.523)
1      0      0      0      0.407 (0.404, 0.410)
0      0      0      0      0.380 (0.378, 0.381)

Table 6
Mortality rates for death at 80 conditional on survival to 79 with nomination
history and winning history when nomination is affected by past winning history

N(30)  A(30)  N(60)  A(60)  N(70)  A(70)  Mortality rates (95% CI)
2      2      0      0      0      0      0.472 (0.401, 0.544)
2      1      0      0      0      0      0.511 (0.489, 0.534)
2      0      0      0      0      0      0.493 (0.484, 0.502)
1      1      0      0      0      0      0.555 (0.533, 0.577)
1      0      0      0      0      0      0.674 (0.669, 0.678)
1      1      1      1      0      0      0.474 (0.437, 0.511)
1      1      1      0      0      0      0.468 (0.444, 0.492)
1      0      1      1      0      0      0.191 (0.177, 0.206)
1      0      1      0      0      0      0.190 (0.184, 0.196)
0      0      1      1      0      0      0.439 (0.418, 0.460)
0      0      1      0      0      0      0.521 (0.514, 0.528)
0      0      2      2      0      0      0.137 (0.119, 0.155)
0      0      2      1      0      0      0.092 (0.087, 0.098)
0      0      2      0      0      0      0.039 (0.036, 0.041)
0      0      0      0      0      0      0.617 (0.615, 0.619)



If we consider a discrete time hazard model, the mortality rate can be
modeled as follows:



where h is the mortality rate, and Z_i is the indicator function of nomination
and winning history in the full model, or the indicator function of nomination
history in the reduced model. Then we can estimate the coefficients θ_i based
on the mortality rates calculated above. With this discrete time hazard
model, we can calculate the log likelihood of the full model and the reduced
model for each simulation round. Because

\[
-2\{\text{loglikelihood(Reduced Model)} - \text{loglikelihood(Full Model)}\} \sim \chi^2_{12}
\]

when the reduced model is true, we can obtain approximate p-values for the
test of whether winning has an effect on mortality given nomination history.
If the mean of p-values is significantly different from 0.5, then it shows that
even if one has a correct model for the conditional hazard of death given all
the measured time-dependent confounding factors, the model still provides
a biased estimate of the effect of winning on survival.

The mean of p-values over 1000 simulation rounds is 0.404, showing that
there is bias. The histograms of p-values and test statistics are shown in
Figure 1.
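
The p-value of the likelihood ratio test can be computed from the two log likelihoods as sketched below. This is an illustration under stated assumptions, not the code behind the reported results: the function names are hypothetical, and the 12 degrees of freedom are our reading of the reference distribution, which matches the 21 history patterns in Tables 5 and 6 minus the 9 patterns in Table 7. The closed-form tail probability used here holds only for an even number of degrees of freedom.

```python
import math


def chi2_sf_even_df(x: float, df: int) -> float:
    """Upper tail probability of a chi-square variable with an even number of df.

    For df = 2k, P(X > x) = exp(-x/2) * sum_{j=0}^{k-1} (x/2)^j / j!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    if x <= 0:
        return 1.0
    half = x / 2.0
    term, total = 1.0, 1.0
    for j in range(1, df // 2):
        term *= half / j  # (x/2)^j / j!, built up incrementally
        total += term
    return math.exp(-half) * total


def lrt_p_value(loglik_reduced: float, loglik_full: float, df: int = 12) -> float:
    """p-value of the likelihood ratio test of the reduced model against the full model."""
    statistic = -2.0 * (loglik_reduced - loglik_full)
    return chi2_sf_even_df(statistic, df)
```

For example, `lrt_p_value(-1010.0, -1000.0)` corresponds to a test statistic of 20 on 12 degrees of freedom; the log likelihood values here are made up for illustration.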

Table 7
Mortality rates conditional on nomination history when nomination is affected
by past winning history

Death age  N(30)  N(60)  N(70)  Mortality rates (95% CI)
70         2      0             0.331 (0.324, 0.337)
70         1      0             0.413 (0.410, 0.416)
70         0      0             0.380 (0.378, 0.381)
80         2      0      0      0.495 (0.487, 0.503)
80         1      0      0      0.668 (0.664, 0.672)
80         1      1      0      0.227 (0.222, 0.232)
80         0      1      0      0.513 (0.506, 0.519)
80         0      2      0      0.059 (0.057, 0.062)
80         0      0      0      0.617 (0.616, 0.619)



In the above simulation setting, nomination history is both a confounder 
for winning history's effect on survival and has been affected by winning 
history. We now show that if nomination history is only a confounder and has 
not been affected by winning history, then the time-dependent Cox model 
that controls for nomination history produces correct inferences. We keep 
the same simulation set up as before, except that we change the selection 
weights for age group 60-69 in Table 3 to the selection weights in Table 8. 

We still restrict every candidate to be nominated at most twice and win
at most twice. Based on 1000 Monte Carlo simulations, we obtained estimated
mortality hazard rates for this full model in Table 9. For a reduced
model without winning history, the mortality rates adjusting only for
nomination history are shown in Table 10. From the probabilities in Tables 9
and 10, conditioning on the same nomination history, winning does not have
a significant effect on the mortality rates.

[Figure 1: two histogram panels, "Histogram of pvalue" (horizontal axis
pvalue, 0.0 to 1.0) and "Histogram of test.value" (horizontal axis
test.value, 5 to 35).]

Fig. 1. Histograms for p-values and test statistics from the likelihood ratio test of whether
winning has an effect on mortality given nomination history based on the discrete time
hazard model when the nomination is affected by past winning history.

Table 8
Selection weight for age 60-69 when nomination history is not affected by
winning history

          Previous winner   Previous nonwinner
Group 1   0                 0
Group 2   8                 8
Group 3   9                 9

Similarly, based on the discrete time hazard model, the mean of p-values
in 1000 Monte Carlo simulations is 0.52, and the p-values and test statistics
of the likelihood ratio test are shown in Figure 2. The simulation illustrates
that when nomination is not affected by the past winning history, a correct
time-dependent hazard model does not suffer from the healthy performer
survivor bias.

Table 9
Mortality rates conditional on nomination history and winning history when
nomination is not affected by past winning history

Death age  N(30)  A(30)  N(60)  A(60)  N(70)  A(70)  Mortality rates (95% CI)
70         2      2      0      0                    0.317 (0.267, 0.368)
70         2      1      0      0                    0.327 (0.310, 0.343)
70         2      0      0      0                    0.337 (0.330, 0.344)
70         1      1      0      0                    0.424 (0.411, 0.436)
70         1      0      0      0                    0.425 (0.422, 0.429)
70         0      0      0      0                    0.389 (0.388, 0.390)
80         2      2      0      0      0      0      0.514 (0.449, 0.578)
80         2      1      0      0      0      0      0.508 (0.486, 0.529)
80         2      0      0      0      0      0      0.494 (0.485, 0.503)
80         1      1      0      0      0      0      0.592 (0.574, 0.611)
80         1      0      0      0      0      0      0.575 (0.570, 0.580)
80         1      1      1      1      0      0      0.461 (0.418, 0.504)
80         1      1      1      0      0      0      0.483 (0.454, 0.513)
80         1      0      1      1      0      0      0.481 (0.464, 0.498)
80         1      0      1      0      0      0      0.480 (0.472, 0.487)
80         0      0      1      1      0      0      0.686 (0.674, 0.697)
80         0      0      1      0      0      0      0.688 (0.683, 0.693)
80         0      0      2      2      0      0      0.428 (0.396, 0.459)
80         0      0      2      1      0      0      0.453 (0.440, 0.465)
80         0      0      2      0      0      0      0.451 (0.444, 0.458)
80         0      0      0      0      0      0      0.559 (0.558, 0.561)

Table 10
Mortality rates conditional on nomination history when nomination is not
affected by past winning history

Death age  N(30)  N(60)  N(70)  Mortality rates (95% CI)
70         2      0             0.334 (0.328, 0.340)
70         1      0             0.425 (0.422, 0.428)
70         0      0             0.389 (0.388, 0.390)
80         2      0      0      0.499 (0.491, 0.507)
80         1      0      0      0.577 (0.572, 0.581)
80         1      1      0      0.480 (0.474, 0.487)
80         0      1      0      0.687 (0.682, 0.691)
80         0      2      0      0.451 (0.445, 0.457)
80         0      0      0      0.559 (0.558, 0.561)



[Figure 2: two histogram panels, "Histogram of pvalue" (horizontal axis
0.0 to 1.0) and "Histogram of test.value" (horizontal axis 5 to 25).]

Fig. 2. Histograms for p-values and test statistics from the likelihood ratio test of whether
winning has an effect on mortality given nomination history based on the discrete time
hazard model when the nomination is not affected by past winning history.

3. Rank preserving structural accelerated failure time model. Robins
(1986, 1992) and Robins et al. (1992) recognized the potential of
conventional time-dependent proportional hazard models to provide biased
estimates of causal effects when there are healthy performer survivor effects
(Robins called these healthy worker effects). Robins (1986) was particularly
concerned with occupational mortality studies in which unhealthy workers
who terminate employment early are at an increased risk of death compared
to other workers and receive no further exposure to the chemical agent under
study. More generally, Robins has shown that the usual time-dependent Cox
proportional hazards model approach might be biased when "(a) there exists
a time-dependent risk factor for, or predictor of, the event of interest that
also predicts subsequent treatment and (b) past treatment history predicts
subsequent risk factor level." In our context (a) nomination history is a
time-dependent risk factor for death and a predictor of winning subsequent
Oscar Awards, and (b) past winning history predicts future nomination. Robins
developed the rank preserving structural accelerated failure time model with
g-estimation to eliminate bias from the time-dependent Cox proportional
hazards model under conditions (a) and (b) above. We will adapt Robins'
rank preserving structural accelerated failure time model and g-estimation
method.

Our key assumption is as follows: 

Assumption 1 (Randomization assumption). Conditional on age, pre- 
vious nominations, and previous wins, the winner of an Oscar Award in each 
year is selected randomly among nominees for that award. 

We make no assumption about the nominees being randomly selected 
from the pool of actors and actresses, only that the winner is randomly cho- 
sen (conditional on covariates) among the nominees. Indeed, some pundits 
suggest that being nominated for an Oscar Award is due to talent, whereas 
winning one is due to luck [Sylvestre, Huszti and Hanley (2006)]. Gehrlein 
and Kher (2004) provide further discussion of Oscar Award selection proce- 
dures. 

3.1. Basic setup. We focus on the causal effect of winning an Oscar 
Award for the first time on a performer's survival, and do not consider any 
additional effect of multiple wins here. We focus only on comparing winners 
to nonwinning nominees. 

To simplify our discussion, we use candidate $(i, j)$ to denote a candidate $j$
who has been nominated for the $i$th Oscar Award. There are a total of 300
Oscar Awards in our data, so $i = 1, 2, \ldots, 300$. We assume the existence of a
latent or potential failure time variable $U_{ij}$, which represents the number of
years candidate $(i, j)$ would live after the award date if he or she did not win
an Award on date $i$ or in the rest of his or her lifetime. However, we only
observe the failure time $T_{ij}$, the number of years candidate $(i, j)$ actually
lives after the award date until his or her death. We
will assume that the $T_{ij}$ are uncensored until Section 3.4, where we will
consider censoring.

3.2. Rank preserving structural accelerated failure time model. The rank
preserving structural accelerated failure time model (RPSAFTM) assumes
that winning an Oscar for the first time multiplies a performer's remaining
lifetime by a treatment effect factor $\exp(-\psi)$. Thus $-\psi$ is the
additive effect of winning on the log of a performer's remaining lifetime after the
award. A positive $\psi$ means winning decreases lifetime, a negative $\psi$ means
winning increases lifetime, and $\psi = 0$ means winning has no effect. See Cox
and Oakes (1984) and Robins (1992) for more discussion of the accelerated
failure time model.

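For the RPSAFTM, the potential failure time $U_{ij}$ can be calculated from
the observed failure time $T_{ij}$ as follows. Let $F_{ij}$ be the first time candidate
$(i, j)$ won an Oscar Award ($F_{ij} = \infty$ if the candidate never won an Award),
and let $D_i$ be the date of the $i$th Oscar Award. Let set $A$ contain candidates who
never won an Oscar Award in their whole lifetime, set $B$ contain candidates
who won Oscar Awards at least once and for whom $F_{ij} < D_i$, and set $C$
contain candidates who won Oscar Awards at least once and for whom
$F_{ij} \ge D_i$. We have

$$
U_{ij}(\psi) =
\begin{cases}
T_{ij}, & \text{if candidate } (i, j) \in A \cup B, \\
(F_{ij} - D_i) + \exp(\psi)\,(D_i + T_{ij} - F_{ij}), & \text{if candidate } (i, j) \in C.
\end{cases}
\tag{1}
$$

Here $D_i + T_{ij}$ is the date of death, so for a candidate in $C$ the time from the
award date to the first win is left unchanged and the time from the first win
to death is multiplied by $\exp(\psi)$.

As an example, consider Marlon Brando, who was born on April 3, 1924,
and died on July 1, 2004. Brando was nominated for an Oscar for the first
time on March 20, 1952 ($i = 77$), but did not win the Award. He won two
Oscar Awards in his career: the first on March 30, 1955 ($i = 89$) and
the second on April 27, 1973 ($i = 161$). His information is listed in
Table 11. By (1),

$$
\begin{aligned}
U_{77,B}(\psi)  &= (\text{30Mar55} - \text{20Mar52}) + \exp(\psi)(\text{1Jul04} - \text{30Mar55}), \\
U_{81,B}(\psi)  &= (\text{30Mar55} - \text{19Mar53}) + \exp(\psi)(\text{1Jul04} - \text{30Mar55}), \\
U_{85,B}(\psi)  &= (\text{30Mar55} - \text{25Mar54}) + \exp(\psi)(\text{1Jul04} - \text{30Mar55}), \\
U_{89,B}(\psi)  &= \exp(\psi)(\text{1Jul04} - \text{30Mar55}), \\
U_{101,B}(\psi) &= \text{1Jul04} - \text{26Mar58}, \\
U_{161,B}(\psi) &= \text{1Jul04} - \text{27Apr73}, \\
U_{165,B}(\psi) &= \text{1Jul04} - \text{2Apr74}, \\
U_{231,B}(\psi) &= \text{1Jul04} - \text{26Mar90}.
\end{aligned}
$$

The subscript "B" represents Marlon Brando. Note that in the RPSAFTM
(1), Brando's multiple wins have no additional effect on his survival beyond
his first win.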
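
As an illustration of model (1), the following minimal Python sketch computes $U_{ij}(\psi)$ from calendar dates. The function name `latent_failure_time`, its arguments, and the use of days rather than years are assumptions made for this example; they are not taken from the authors' R code in Supplement B, and the value `psi=-0.1` is purely illustrative.

```python
import math
from datetime import date

def latent_failure_time(award_date, death_date, first_win_date, psi):
    """Latent post-award lifetime U_ij(psi) in days, following model (1).

    first_win_date is None for a candidate who never won (set A).
    """
    observed = (death_date - award_date).days  # T_ij, measured in days here
    if first_win_date is None or first_win_date < award_date:
        # Sets A and B: never won, or had already won before award date D_i.
        return float(observed)
    # Set C: the time up to the first win is kept as is; the time from the
    # first win to death is rescaled by exp(psi).
    before_win = (first_win_date - award_date).days
    after_win = (death_date - first_win_date).days
    return before_win + math.exp(psi) * after_win

# Marlon Brando's dates from Table 11 (hypothetical psi for illustration).
death = date(2004, 7, 1)
first_win = date(1955, 3, 30)
u_77 = latent_failure_time(date(1952, 3, 20), death, first_win, psi=-0.1)
u_101 = latent_failure_time(date(1958, 3, 26), death, first_win, psi=-0.1)
```

With a negative `psi`, `u_77` is shorter than Brando's observed post-nomination lifetime, while `u_101` equals the observed lifetime because the 1958 nomination came after his first win.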

Table 11
Marlon Brando's nominations

Nomination date   Number of award (i)   Award                   Win
20Mar52           77                    Best Actor              N
19Mar53           81                    Best Actor              N
25Mar54           85                    Best Actor              N
30Mar55           89                    Best Actor              Y
26Mar58           101                   Best Actor              N
27Apr73           161                   Best Actor              Y
2Apr74            165                   Best Actor              N
26Mar90           231                   Best Supporting Actor   N

3.3. Test of treatment effect on survival. Although the latent failure time
variable $U_{ij}$ can be calculated based on the treatment effect $\psi$, $\psi$ is
still an unknown parameter that we need to estimate. The basic idea for
testing the plausibility of a hypothesized treatment effect under
Assumption 1 is the following: if the hypothesized treatment effect is correct, the
latent failure times in the treatment (winning) and control (nonwinning)
groups should be similar, but if the hypothesized treatment effect is too
large (small), the latent failure times in the treatment group will tend to be
smaller (larger) than those in the control group.

To explain the details, let $A_{ij}$ denote the treatment status for
candidate $(i, j)$:

$$
A_{ij} =
\begin{cases}
1, & \text{if candidate } (i, j) \text{ wins the } i\text{th award}, \\
0, & \text{if candidate } (i, j) \text{ loses the } i\text{th award}.
\end{cases}
$$

Note that $A_{ij}$ is only defined if $j$ was nominated for the $i$th award. Let $W_{ij}$
denote the vector of candidate $(i, j)$'s covariates, such as age at time of
nomination, number of previous nominations, and number of previous wins.
Note that some of the covariates in $W_{ij}$ can be time dependent.

Let $U_{ij}(\psi_0)$ denote the latent failure time if $\psi_0$ is the true treatment
effect; $U_{ij}(\psi_0)$ can be calculated from (1). Consider a logistic regression
model for the probability that candidate $(i, j)$ wins award $i$ conditional on
$W_{ij}$ and $U_{ij}(\psi_0)$:

$$
P(A_{ij} = 1 \mid W_{ij}, U_{ij}(\psi_0)) =
\frac{\exp\bigl(\beta^{T} W_{ij} + \theta(\psi_0)\, U_{ij}(\psi_0)\bigr)}
     {1 + \exp\bigl(\beta^{T} W_{ij} + \theta(\psi_0)\, U_{ij}(\psi_0)\bigr)},
\tag{2}
$$

where $\beta$ and $\theta(\psi_0)$ are unknown parameters. We use conditional
logistic regression for estimating (2), where we condition on there being one
winner among the nominees for each award. Only the nominees for each
award are considered in the conditional logistic regression, that is, the
candidates included in the regression are $(i, j_1), \ldots, (i, j_{n_i})$, where
$i = 1, \ldots, 300$ and $j_1, \ldots, j_{n_i}$ are the nominees for the $i$th award
($n_i = 5$ except for some early awards). See the last two paragraphs of this
section for discussion of a modification of this conditional logistic regression
that improves efficiency. Model (2) combined with conditioning on there being
one winner for each award is equivalent to the model that the winner of award $i$
is determined according to McFadden's (1974) choice model, where
$(W_{ij_1}, U_{ij_1}(\psi_0)), \ldots, (W_{ij_{n_i}}, U_{ij_{n_i}}(\psi_0))$ are the
covariates that describe the choices for the award.
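
To make the conditioning on one winner per award concrete, here is a minimal Python sketch of the conditional (McFadden-type) log-likelihood implied by model (2). The function name `conditional_loglik`, the data layout, and the toy numbers are assumptions for illustration only; in practice the model would be fitted with standard software rather than this hand-written function.

```python
import math

def conditional_loglik(strata, beta, theta):
    """Conditional logistic log-likelihood with exactly one winner per award.

    strata: list of awards; each award is a list of (won, w, u) tuples, where
            won is 1 for the winner and 0 otherwise, w is a list of covariate
            values W_ij, and u is the latent failure time U_ij(psi_0).
    """
    total = 0.0
    for nominees in strata:
        scores = [sum(b * x for b, x in zip(beta, w)) + theta * u
                  for _, w, u in nominees]
        shift = max(scores)  # subtract the maximum for numerical stability
        log_denom = shift + math.log(sum(math.exp(s - shift) for s in scores))
        winner_score = sum(s for (won, _, _), s in zip(nominees, scores) if won)
        total += winner_score - log_denom
    return total

# Two hypothetical awards with five nominees each: (won, [age], latent time).
example = [
    [(1, [30.0], 10.0)] + [(0, [40.0], 20.0)] * 4,
    [(1, [35.0], 10.0)] + [(0, [45.0], 20.0)] * 4,
]
loglik_at_zero = conditional_loglik(example, beta=[0.0], theta=0.0)
```

When all coefficients are zero, each nominee is equally likely to win, so each award contributes the log of one over the number of nominees to the log-likelihood.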

For the true $\psi$, the coefficient $\theta(\psi)$ on $U_{ij}(\psi)$ in (2) should equal zero.
This is because under Assumption 1, conditional on the covariates $W_{ij}$ of
the nominees for an award, the latent failure times $U_{ij}$ of the nominees are
independent of which nominee wins the award, that is,

$$
P(A_{ij} = 1 \mid W_{ij}, U_{ij}) = P(A_{ij} = 1 \mid W_{ij}).
$$

We test the null hypothesis that $\psi$ equals a particular value $\psi_0$ by seeing
whether a score test accepts or rejects the null hypothesis that the true value
of $\theta(\psi_0)$ is 0. In other words, we test

$$
H_{10}: \psi = \psi_0 \quad \text{vs.} \quad H_{1a}: \psi \ne \psi_0
$$

by testing

$$
H_{20}: \theta(\psi_0) = 0 \quad \text{vs.} \quad H_{2a}: \theta(\psi_0) \ne 0.
$$

Rejection of $H_{20}$ implies rejection of $H_{10}$, and acceptance of $H_{20}$ implies
acceptance of $H_{10}$. We invert this test to find a confidence interval for $\psi$,
that is, the 95% confidence interval consists of all $\psi_0$ for which we do not
reject $H_{20}$.
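
The following Python sketch shows what a score test of $\theta(\psi_0) = 0$ looks like in a deliberately simplified setting: a conditional logistic model with one winner per award and the latent failure time as the only covariate. Omitting the covariates $W_{ij}$ is an assumption made to keep the example short, so this is not the exact test used in the paper, and the function name `score_test_theta_zero` is hypothetical.

```python
import math

def score_test_theta_zero(strata):
    """Score test of theta = 0 with the latent time as the only covariate.

    strata: list of awards; each award is a list of (won, u) pairs with
            exactly one won == 1.
    Returns the z statistic and its two-sided normal p-value.
    """
    score, info = 0.0, 0.0
    for nominees in strata:
        us = [u for _, u in nominees]
        mean_u = sum(us) / len(us)
        winner_u = sum(u for won, u in nominees if won)
        # At theta = 0 every nominee is equally likely to win, so the score
        # is the winner's u minus the award mean, and the information is the
        # within-award variance of u.
        score += winner_u - mean_u
        info += sum((u - mean_u) ** 2 for u in us) / len(us)
    z = score / math.sqrt(info)
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value

# Hypothetical awards in which the winner always has the average latent time.
balanced = [[(0, 1.0), (1, 2.0), (0, 3.0)], [(0, 4.0), (1, 5.0), (0, 6.0)]]
z_balanced, p_balanced = score_test_theta_zero(balanced)
```

A hypothesized $\psi_0$ that makes winners' latent times look systematically longer or shorter than their competitors' produces a large absolute z and a small p-value, which is the signal used to reject that $\psi_0$.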

We now discuss an efficiency issue for testing $\psi = \psi_0$. If a candidate $(i, j)$
has already won an award before the date of the $i$th Oscar Award, then
$T_{ij} = U_{ij}$ regardless of whether the candidate wins the award at the date of
the $i$th Oscar Award. Such a candidate contributes no information for testing
$\psi = \psi_0$ since $U_{ij}(\psi_0)$ is a constant function of $\psi_0$. Consequently, it is more
efficient for testing $\psi = \psi_0$ to not include candidates $(i, j)$ in the analysis
who have already won an award before the date of the $i$th Oscar Award.
In fact, we found that for the Oscar data, the confidence interval based on
excluding candidates who have already won an award was 20% shorter than
the confidence interval based on including the previous winners.

As an example of excluding candidates who have already won, for Marlon
Brando, we do not include (101, B), (161, B), (165, B), (231, B) because
Brando won the 89th Oscar Award (see Table 11). Because we estimate (2)
using conditional logistic regression in which we condition on the number of
winners for each award, by dropping candidates $(i, j)$ who have already won
an award before award $i$, we effectively drop all data from awards in which
the winner had already won an award before.
3.4. Censoring case. If the lifetimes for all candidates were observed and
Assumption 1 holds, the above analysis would provide consistent tests for the
treatment effect. However, if some of the lifetimes are censored and we treat
the censored lifetime as the observed lifetime, there will be a violation of
Assumption 1. Let $C_{ij}$ denote the censoring time of candidate $(i, j)$. For our
data, $C_{ij}$ = July 25, 2007 for all $i, j$. Instead of observing the failure time $T_{ij}$
of how long candidate $j$ lives after the date $D_i$ of award $i$, we observe the
censored failure time $X_{ij} = \min(T_{ij}, C_{ij} - D_i)$. Consider the variable $U^{*}_{ij}(\psi)$
that is generated by substituting $X_{ij}$ for $T_{ij}$ in the RPSAFTM (1) to
calculate $U_{ij}$. If $\psi \ne 0$, then $U^{*}_{ij}(\psi)$ is not independent of $A_{ij}$ given $W_{ij}$. To
illustrate this, we provide the following example. Suppose there is a positive
treatment effect for winning an Oscar Award on performers' survival.
Consider a candidate A who won just once in his whole career. Suppose he
won on date $D$. Assume his actual remaining lifetime after $D$ is $T$. If there
is a positive treatment effect, his latent failure time value will be $U$, where
$U < T$. When the censoring time $C$ satisfies $U < C - D < T$, the corresponding
$U^{*}(\psi)$ generated by substituting $C - D$ for $T$ in the RPSAFTM will be
smaller than $U$ for the true $\psi$. Now consider a candidate B who has the same
latent failure time $U$ and the same censoring time $C$ as candidate A, but
who never won any awards. For candidate B, we have $U^{*}(\psi) = U$. Hence, for
these two candidates with identical $U$'s, winning is associated with $U^{*}(\psi)$.
In summary, when there is a positive treatment effect, winning an Oscar
Award will prolong performers' lifetime, making latent failure times more
likely to get censored compared to nonwinning nominees, and causing bias
if censored failure times are treated as actual failure times.

In the above example, if we want to have the same censored latent failure
time for both winning and losing performers who have the same actual latent
failure time, we can modify the censoring time for the losing performer to
be before the actual censoring time so that $U^{*}(\psi)$ will be censored in the
same way regardless of whether a performer wins or loses. This is Robins et
al.'s (1992) idea of artificial censoring.

We define an observable variable $U^{**}_{ij}(\psi_0)$ that is a function of
$(U_{ij}(\psi_0), A_{ij})$ and use it as a basis for inference concerning $\psi_0$.
$U^{**}_{ij}(\psi_0)$ is defined by censoring $U_{ij}(\psi_0)$ at the artificial
censoring time $C_{ij}(\psi_0)$ that is defined below.

Recall that $F_{ij}$ is candidate $(i, j)$'s first win time, and $D_i$ is the date of the
$i$th Oscar Award. When $F_{ij} \ge D_i$,

$$
C_{ij}(\psi_0) = \min\bigl(C_{ij} - D_i,\; (C_{ij} - D_i)\exp(\psi_0)\bigr).
$$

When $F_{ij} < D_i$,

$$
C_{ij}(\psi_0) = C_{ij} - D_i.
$$
Then $U^{**}_{ij}(\psi_0) = \min(U_{ij}(\psi_0), C_{ij}(\psi_0))$. We substitute
$U^{**}_{ij}(\psi_0)$ for $U_{ij}(\psi_0)$ in the conditional logistic regression model (2),
and test the null hypothesis $\theta(\psi_0) = 0$. Note that $U^{**}_{ij}(\psi_0)$ could be any
observable function of $U_{ij}(\psi_0)$ and $C_{ij}(\psi_0)$, not just
$\min(U_{ij}(\psi_0), C_{ij}(\psi_0))$. Robins (1993) describes the semiparametric
efficient such function.

Table 12
Simulation results

Type of analysis   Status    Time-zero        Mean of p-value
PH                 Static    Birthday         0.03
PH                 Dynamic   Birthday         0.12
PH                 Dynamic   Nomination day   0.12
PY                 Dynamic   Nomination day   0.04
RPSAFTM            Dynamic   Nomination day   0.49
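
The artificial censoring construction of Section 3.4 can be sketched in Python as follows. The function names, the argument `u_from_x` (the latent time computed from the possibly censored $X_{ij}$), and the numbers in the example are assumptions for illustration; the example mirrors the two-candidate argument given earlier in that section, with a hypothetical $\psi_0 = \log 0.5$.

```python
import math

def artificial_censoring_time(c_minus_d, psi0, won_before_award):
    """C_ij(psi0), given the administrative follow-up C_ij - D_i."""
    if won_before_award:  # F_ij < D_i: no rescaling is involved
        return c_minus_d
    # F_ij >= D_i (including never winning): censor at the smaller of the
    # follow-up time and the follow-up time rescaled by exp(psi0).
    return min(c_minus_d, c_minus_d * math.exp(psi0))

def censored_latent_time(u_from_x, c_minus_d, psi0, won_before_award):
    """U**_ij(psi0): the latent time censored at the artificial time."""
    return min(u_from_x,
               artificial_censoring_time(c_minus_d, psi0, won_before_award))

psi0 = math.log(0.5)  # hypothetical: winning doubles the remaining lifetime
follow_up = 30.0      # C - D, in years

# Candidate A wins on date D, has T = 40 (so U = 20), and is censored at 30.
winner = censored_latent_time(math.exp(psi0) * follow_up, follow_up, psi0, False)
# Candidate B never wins and has the same latent time U = 20, fully observed.
loser = censored_latent_time(20.0, follow_up, psi0, False)
```

Both candidates end up with the same value of $U^{**}$, so the association between winning and the censored latent time described above is removed.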

3.5. Simulation results. In Section 2.4 our simulation study showed that
previous studies suffered from healthy performer survivor bias. Here we
use the same setup to test the RPSAFTM. Recall that a correct analysis
method should produce approximately uniformly distributed p-values in the
simulation study. The results in Table 12 are from 1000 simulations. The
first four rows repeat the simulation results from Section 2.4 (Table 4),
and the last row adds the RPSAFTM.

Figure 3 contains histograms for p-values of the five methods from 1000
simulations.



[Figure 3: histograms of p-values. Panels include "Histogram of Cox Model, Static,
Birthday", "Histogram of Cox Model, Dynamic, Birthday", and "Histogram of RPSAFTM,
Dynamic, Nomination Day"; horizontal axis: pvalue.]

Fig. 3. Histograms for p-values from the test of whether winning has an effect on
mortality based on the four methods of previous studies and the RPSAFTM introduced by the
current paper.

In the first four plots, the majority of the p-values are smaller than 0.2,
while in the last plot, the p-values are approximately uniformly distributed.
The RPSAFTM corrects the survivor treatment selection bias that previous
methods suffer from.
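
One simple way to quantify how far a set of simulated p-values is from uniform, complementing the mean p-value and the histograms, is the Kolmogorov-Smirnov distance from the Uniform(0, 1) distribution. The Python sketch below is an illustration only: the function name `ks_distance_from_uniform` is hypothetical and this check is not part of the paper's analysis.

```python
def ks_distance_from_uniform(p_values):
    """Largest gap between the empirical CDF of p_values and the Uniform(0, 1) CDF."""
    xs = sorted(p_values)
    n = len(xs)
    # At each sorted value x, the empirical CDF jumps from k/n to (k+1)/n,
    # while the Uniform(0, 1) CDF equals x; take the larger of the two gaps.
    return max(max((k + 1) / n - x, x - k / n) for k, x in enumerate(xs))

# Hypothetical inputs: evenly spread p-values versus p-values piled up near 0.
evenly_spread = [(k + 0.5) / 1000 for k in range(1000)]
piled_up = [0.01] * 100
distance_even = ks_distance_from_uniform(evenly_spread)
distance_piled = ks_distance_from_uniform(piled_up)
```

A valid test should give a distance close to zero under the null, whereas p-values concentrated near zero, as for the biased methods, give a distance close to one.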

4. Analysis of Oscar Award data. We have compiled a data file that 
records the nominees and winners for each award (best lead actor, best 
lead actress, best supporting actor, best supporting actress) on each Os- 
car Award date. We collected the data from www.imdb.com. The data is 
in the supplementary materials [Han et al. (2010)]. The selection inter- 
val spanned from the inception of the Oscar Awards to July 25, 2007. In 
computing lifetime since being nominated, we use the actual Oscar Award 
date which varies from year to year. People who were not reported dead on 
www.imdb.com were presumed to be alive. There are 260 winners and 564 
nonwinning nominees, 824 performers in all. Of these 824 performers, 448 
are censored. 

We did not include several candidates in our data set. Margaret Avery 
was nominated for best supporting actress in 1985, but we could not find 
her birthday and day of death from the internet. We did not include the 
following candidates who died before the winner of the award for which 
they were nominated was announced: Massimo Troisi, Jeanne Eagels, James 
Dean, Spencer Tracy, Peter Finch, and Ralph Richardson. 

We have shown results from previous studies, which are based on fewer
years of Oscar data than ours, in Table 1. To compare previous studies
with ours, we have applied the methods of previous studies to our updated
Oscar Award data set; the results are shown in Table 13. Compared with the
results in Table 1, the reductions in mortality rate in Table 13 are smaller.
The confidence intervals are also narrower, because our data set includes
7 more years of candidates than the original data set, and each candidate
in our data set has 7 more years of information.

Table 13
Winners vs. nominees

Type of analysis   Status    Time-zero        Reduction in mortality rate (95% CI) (%)
PH                 Static    Birthday         19 (6 to 31)
PH                 Dynamic   Birthday         9 (-6 to 22)
PH                 Dynamic   Nomination day   14 (0 to 26)
PY                 Dynamic   Nomination day   10 (-6 to 23)
PH2                Dynamic   Nomination day   8.7 (-7.3 to 24.7)
Table 14
Summary of conditional logistic model

Covariate             coef        exp(coef)   se(coef)   z        p-value
$U^{**}_{ij}(0)$      1.37e-02    1.01        0.007541   1.812    0.07
nomage                5.36e-02    1.06        0.101676   0.527    0.60
nomage.square         -9.18e-04   1.00        0.002278   -0.403   0.69
nomage.cubic          7.40e-06    1.00        0.000016   0.462    0.64
numprenom             6.99e-02    1.07        0.071407   0.979    0.33



In Table 13 the first four rows are based on previous methods. The fifth
row corresponds to a Cox time-dependent model adjusting for past nomination
history and winning history; nomination history is adjusted for by
conditioning on the number of previous nominations. Note that
previous methods did not consider the nomination history.

We now consider fitting the RPSAFTM. For the conditional logistic
regression (2), we use the following time dependent covariates $W_{ij}$: age at
nomination (nomage), square of age at nomination (nomage.square), cube
of age at nomination (nomage.cubic), and number of previous nominations
(numprenom). Table 14 shows the results of the conditional logistic
regression model (2) when $\psi = 0$.

The p-value for the test of whether the coefficient on $U^{**}_{ij}(0)$ is 0, that is,
the test of $H_{20}: \theta(0) = 0$ vs. $H_{2a}: \theta(0) \ne 0$, is 0.07. Thus, we do not reject the
null hypothesis that winning an Oscar has no effect on a performer's survival
at the 0.05 level. Looking at the effect of the other covariates (the $W_{ij}$) in
Table 14, there is not strong evidence that number of previous nominations
has an effect on the probability of a performer winning. For age at time of
nomination, although the p-values on each of the polynomial terms are not
significant, a test that the coefficients on all three of the terms are zero gives
a p-value of 0.03, so age at time of nomination does appear to affect winning.
Older nominees are slightly more likely to win.

The validity of our test of the effect of winning an Oscar depends critically
on correctly controlling for the effect of age at time of nomination on winning
since this age is clearly correlated with $U_{ij}$ (older nominees generally live
a shorter time after the award date, so have smaller $U_{ij}$'s). To check that
our results are robust to different ways of controlling for age at time of
nomination, we replaced the cubic polynomial in nomage in Table 14 with
a cubic spline of nomage with 1 to 4 knots placed at equally spaced quantiles.
The p-values for the test of $H_{20}: \theta(0) = 0$ vs. $H_{2a}: \theta(0) \ne 0$ ranged from 0.064
to 0.07 in these analyses. Thus, our result that there is not evidence that
winning has an effect on survival at the 0.05 level is robust to how nomage
is controlled for. We will use the cubic polynomial for nomage in Table 14
in our subsequent discussion.
Table 15
95% confidence interval for treatment effect

Quantity                                        95% CI
Treatment effect $\psi$                         [-0.2360, 0.0088]
Winning multiplies survival by $\exp(-\psi)$    [0.9912, 1.2662]



Table 15 shows the 95% confidence interval for the treatment effect. Our 
95% confidence interval is that the effect of winning is in the range of de- 
creasing survival (after the award date) by 0.88% to increasing survival by 
26.62%. 
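
The percentages above follow directly from the factor $\exp(-\psi)$ evaluated at the two endpoints of the interval for $\psi$. A short Python check, using a hypothetical helper name `survival_multiplier`, is:

```python
import math

def survival_multiplier(psi):
    """Factor by which winning multiplies post-award survival under the RPSAFTM."""
    return math.exp(-psi)

# Endpoints of the 95% confidence interval for psi reported in Table 15.
lower = survival_multiplier(0.0088)    # upper end of psi gives the lower multiplier
upper = survival_multiplier(-0.2360)   # lower end of psi gives the upper multiplier
percent_change = [round(100 * (m - 1), 2) for m in (lower, upper)]
```

Because $\exp(-\psi)$ is decreasing in $\psi$, the endpoints of the interval swap when moving from the $\psi$ scale to the multiplier scale.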

Robins' g-estimate for the treatment effect is the $\psi_0$ that makes the
estimated coefficient $\hat{\theta}(\psi_0)$ equal to 0 in the conditional logistic
regression (2). This $\psi_0$ maximizes the p-value for testing
$H_{20}: \theta(\psi_0) = 0$ vs. $H_{2a}: \theta(\psi_0) \ne 0$. Robins et al. (1992) show that the
g-estimate is asymptotically normal and consistent. The g-estimate can also
be viewed as the Hodges-Lehmann (1963) estimate of the treatment effect
based on the test of $H_{20}: \theta(\psi_0) = 0$.

We search for values of $\psi_0$ with $\hat{\theta}(\psi_0) = 0$ in the range
$[-0.2360, 0.0088]$ with step size 0.0001. Figure 4 shows the estimates
$\hat{\theta}(\psi_0)$ and the p-values for testing $H_{20}: \theta(\psi_0) = 0$.
$\hat{\theta}(\psi)$ is a monotone increasing function of $\psi$ in $[-0.2360, 0.0088]$.
The g-estimate is $\hat{\psi} = -0.1127$, which corresponds
to winning increasing survival by 12% ($\exp(0.1127) \approx 1.12$). To estimate the
survival advantage for winners in terms of years, we consider the performers
who won the first time they were nominated. For these performers, we find their
censored latent failure time $U^{**}_{ij}(\hat{\psi})$ under the assumption that the point
estimate $\hat{\psi}$ of $\psi$ is the true treatment effect. Then we make Kaplan-Meier
estimates for the distribution of the actual survival times for these winners and for the
distribution of the latent survival times if these winners had never won. The
difference between the estimated medians of these two distributions is an
estimate of the survival advantage of winning the award for these winners.
In the current Oscar Award data, we estimate the survival advantage to be
4.2 years, with a 95% confidence interval of [-0.4, 8.4] years.

Fig. 4. Estimate of the coefficient $\theta$ of the modified potential failure time variable
$U^{**}_{ij}(\psi)$ in the conditional logistic regression (2) for different treatment effect
values, and p-values from the test of whether $\theta(\psi)$ equals zero for different $\psi$.
[Horizontal axis in both panels: $\psi$ from -0.20 to 0.00.]
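
The grid search for the g-estimate and the test inversion for the confidence interval can be sketched generically in Python. In this sketch `test` stands for any routine that refits model (2) at a given $\psi_0$ and returns the estimated coefficient and its p-value; `toy_test`, with its made-up true effect and standard error, is a hypothetical stand-in so that the example runs on its own, and it does not reproduce the Oscar analysis.

```python
import math

def g_estimate_and_ci(test, lo, hi, step, alpha=0.05):
    """Grid search over psi_0.

    test(psi_0) must return (theta_hat, p_value) for H20: theta(psi_0) = 0.
    Returns the grid point whose theta_hat is closest to zero, and the range
    of grid points that are not rejected at level alpha.
    """
    n_steps = int(round((hi - lo) / step))
    grid = [lo + k * step for k in range(n_steps + 1)]
    results = [(psi0, *test(psi0)) for psi0 in grid]
    psi_hat = min(results, key=lambda r: abs(r[1]))[0]
    accepted = [psi0 for psi0, _, p in results if p >= alpha]
    if not accepted:
        raise ValueError("every grid point was rejected; widen the grid")
    return psi_hat, (min(accepted), max(accepted))

def toy_test(psi0, true_psi=-0.1, se=0.05):
    """Hypothetical stand-in for refitting the conditional logistic model."""
    theta_hat = psi0 - true_psi          # monotone increasing in psi0
    z = theta_hat / se
    return theta_hat, math.erfc(abs(z) / math.sqrt(2.0))

psi_hat, ci = g_estimate_and_ci(toy_test, lo=-0.3, hi=0.1, step=0.001)
```

Reporting the smallest and largest accepted grid points as the interval relies on the estimated coefficient being monotone in $\psi_0$, as the text notes is the case over the range searched.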

4.1. Diagnostic plots. To examine whether the RPSAFTM is appropriate
for the Oscar Award data set, we use boxplots to check if the randomization
assumption (Assumption 1) is violated for latent failure times computed
according to the RPSAFTM at our point estimate $\psi = \hat{\psi}$. This is similar to
the diagnostics for testing an additive treatment effect model in Small et al.
(2006). Based on the randomization assumption, for the point estimate $\hat{\psi}$,
the distributions of $U^{**}_{ij}(\hat{\psi})$ should be approximately the same for the
treatment group (winners) and the control group (nonwinning nominees) in the
same range of nomage. We divide the candidates into five subgroups based
on the quantiles of nomage. For each subgroup, we make boxplots of
$U^{**}_{ij}(\hat{\psi})$ for the winners and the nonwinning nominees. Figure 5 shows that the
distribution of $U^{**}_{ij}(\hat{\psi})$ is similar among winners and nonwinning nominees for each
range of nomage. This supports the validity of the RPSAFTM (assuming
that Assumption 1 is valid).



[Figure 5: five boxplot panels, one per nomage range: nomage <= 30.23;
30.23 < nomage <= 35.51; 35.51 < nomage <= 41.2; 41.2 < nomage <= 50.64;
50.64 < nomage <= 87.78.]

Fig. 5. Boxplots of $U^{**}_{ij}(\hat{\psi})$ for comparison between treatment group (winners) and
control group (nonwinning nominees) in five subgroups based on the quantiles of nomage.
Table 16
Sensitivity analysis

The first column, $\exp(10\theta^*)$, is the odds ratio of winning for two otherwise equal
people, one of whom has a 10 years higher $U_{ij}$ than the other. The last column gives
the survival advantage in years as point estimate/(confidence interval).

$\exp(10\theta^*)$   $\theta^*$   Confidence interval for $\psi$   Survival advantage (years)
0.5                  -0.0693      (-0.6587, -0.4174)               16.4/(13.6, 19.3)
0.6                  -0.0511      (-0.5652, -0.3235)               14.1/(11.0, 17.2)
0.7                  -0.0357      (-0.4769, -0.2374)               11.7/(8.4, 15.1)
0.8                  -0.0223      (-0.3911, -0.1550)               9.3/(5.7, 12.9)
0.9                  -0.0105      (-0.3100, -0.0730)               6.8/(2.8, 10.6)
1                    0            (-0.2360, 0.0088)                4.2/(-0.4, 8.4)
1.1                  0.0095       (-0.1654, 0.0879)                1.4/(-3.7, 6.1)
1.2                  0.0182       (-0.0940, 0.1697)                -1.5/(-6.9, 3.6)
1.3                  0.0262       (-0.0210, 0.2515)                -4.4/(-10.9, 0.8)
1.4                  0.0336       (0.0413, 0.3359)                 -7/(-16.4, -1.3)
1.5                  0.0405       (0.0985, 0.4238)                 -10.3/(-19.2, -4.2)



4.2. Sensitivity analysis. Our basic assumption, Assumption 1, is that, 
conditional on covariates such as age at nomination, and number of pre- 
vious nominations, who wins the Oscar Award is not related to how long 
the candidates would have lived without winning an award. This could be 
violated if performers who lead a more healthy lifestyle are more likely to 
win or if performers who lead a more reckless lifestyle are more likely to win. 
We now provide a sensitivity analysis to violations of Assumption 1. Under
Assumption 1, $\theta(\psi)$ is 0. If Assumption 1 is violated, then $\theta(\psi) = \theta^* \ne 0$.
For $\theta(\psi) = \theta^*$, we can test the plausibility of $\psi_0$ by testing
$H_{30}: \theta(\psi_0) = \theta^*$ vs. $H_{3a}: \theta(\psi_0) \ne \theta^*$. To calibrate $\theta^*$, we note
that we can interpret $\exp(10\theta^*)$
as the odds ratio for one candidate to win compared to another, if the one
candidate has a ten year higher latent failure time than the other and the
two candidates are the same age at nomination and have the same number of
previous nominations. Under Assumption 1, $\exp(10\theta^*) = 1$. Table 16 shows
confidence intervals for $\psi$ and the survival advantage of winning for winners
at first nomination for different values of $\theta^*$.
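
The calibration between the odds ratio and $\theta^*$ is simply $\theta^* = \log(\text{odds ratio})/10$. The Python lines below, with a hypothetical helper name `theta_star`, recover the $\theta^*$ values that correspond to odds ratios from 0.5 to 1.5:

```python
import math

def theta_star(odds_ratio, years=10.0):
    """Coefficient theta* implied by an odds ratio per `years` of latent lifetime."""
    return math.log(odds_ratio) / years

# Odds ratios 0.5, 0.6, ..., 1.5 and the implied theta*, rounded to 4 decimals.
odds_ratios = [round(0.5 + 0.1 * k, 1) for k in range(11)]
calibration = {r: round(theta_star(r), 4) for r in odds_ratios}
```

An odds ratio of 1 gives $\theta^* = 0$, which is the case covered by Assumption 1.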

As the odds ratio $\exp(10\theta^*)$ increases from 0.5 to 1.5, the point
estimate of the survival advantage decreases from 16.4 years to -10.3 years.
If less healthy candidates are moderately more likely to win than healthy
candidates, $\exp(10\theta^*) = 0.9$, then the confidence interval only contains
negative $\psi$, and there is strong evidence that winning increases survival. But if
more healthy candidates are somewhat more likely to win than less healthy
candidates, $\exp(10\theta^*) = 1.2$, then the confidence interval contains
predominantly positive $\psi$ and the point estimate is that winning decreases survival.
5. Discussion. In this paper we point out that healthy performer survivor
bias exists in methods from previous studies of the effect of winning an Oscar
on survival. We show that under Assumption 1 (among nominees, the winner
is randomly selected conditional on baseline covariates), Robins' RPSAFTM
eliminates healthy performer survivor bias. We estimated that the effect of
winning an Oscar Award on survival for winners at first nomination is to
increase survival by 4.2 years, but the 95% confidence interval of [-0.4, 8.4]
years contains negative effects. Thus, our study indicates that there is not
strong evidence that winning an Oscar increases life expectancy.

The analysis in this paper is a case study of how Robins' RPSAFTM can 
provide an improvement over Cox proportional hazards models for estimat- 
ing the effect on survival of a sudden change in a person's life, for example, 
becoming ill, starting a high risk behavior, or starting a treatment. A key 
assumption (our Assumption 1) that is needed to obtain inferences from the 
RPSAFTM is that, conditional on covariates recorded up to a given time, the 
sudden change is "randomly" assigned. A feature of our application, unlike 
most other applications of RPSAFTMs [e.g., Robins et al. (1992); Hernán et
al. (2005)], is that we only assume the sudden change is randomly assigned
among a select subset of the people in the study rather than all people in the 
study. In particular, we are only assuming that among nominees in a given 
year, who are generally at least somewhat healthy in the given year, the win- 
ner is randomly selected. We are not assuming that the winner is randomly 
selected from the pool of all actors and actresses who have been nominated 
in a previous year or the given year and are still alive. Some performers nom- 
inated in a previous year might be too unhealthy to act even though they 
are still alive. Similar consideration of comparability only among a selected 
subset can be found in Joffe et al. (1998) and Robins (2008). 

In the RPSAFTM, model (1) is rank preserving, that is, the effect of
winning is the same for each subject. Robins et al. (1992) and Lok et al.
(2004) discussed an expanded class of SAFTMs, which does not need the
RHS of (1) at the true $\psi$ to be equal to the actual counterfactual failure
time $U_{ij}$; rather, it just needs the RHS and the $U_{ij}$ to have the same
distribution conditional on past measured covariates sufficient to control
confounding. This eliminates the assumption of rank preservation without
changing the method of estimation or the population (i.e., distributional)
interpretation of $\psi$.
SUPPLEMENTARY MATERIAL

Supplement A: Oscar Award data for actors and actresses
(DOI: 10.1214/10-AOAS424SUPPA; .dat). We have compiled a data file that
records the nominees and winners for each award (best lead actor, best lead
actress, best supporting actor, best supporting actress) on each Oscar Award
date. We collected the data from www.imdb.com. The selection interval
spanned from the inception of the Oscar Awards to July 25, 2007.

Supplement B: R code for data analysis and simulation
(DOI: 10.1214/10-AOAS424SUPPB; .zip). We provide the R code for our
data analysis and simulation studies. File "R code.txt" is for preprocessing
the Oscar data and data analysis in Section 4. File "simulation 1.txt" is for
the simulation studies in Sections 2.4 and 3.5, especially for Tables 4 and 12 and
Figure 3. File "simulation 2.txt" is for the simulation studies in Tables 5-10
and Figures 1 and 2.